Reducing Latency and Bandwidth Costs in Parallel Sparse Linear Solvers

Authors

  • Oguz Selvitopi
  • Cevdet Aykanat
Abstract

Parallelizing sparse irregular applications on distributed-memory systems poses serious scalability challenges because communication bottlenecks manifest themselves unpredictably as high bandwidth and/or latency overhead. The relative importance of the different components of the overall communication cost can be disproportionate due to the irregularity and sparsity inherent in the application. Under such conditions, the best strategy for reducing communication overhead should favor the metric that is most crucial for performance; a general method that attributes the same importance to all metrics is likely to suffer. This work takes on the communication challenges posed by latency-bound irregular applications, i.e., applications characterized by a high average and/or maximum number of messages per processor. The basic idea of our approach is to impose a regular communication pattern onto otherwise irregular communication operations and in this way provide a low upper bound on the maximum number of messages handled by a processor. Using a regular communication pattern eliminates the irregularity in latency-bound communication operations and necessitates a store-and-forward scheme that consists of multiple stages of communication. Our findings show that the proposed approach is a remedy for latency-bound applications: it scales seemingly unscalable instances and leads to an average 50% reduction in parallel runtime on 256 processors.
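The store-and-forward idea described in the abstract can be illustrated with a small simulation. The sketch below is a minimal illustration and not the authors' implementation: it assumes a hypercube as the regular communication pattern (the abstract does not name a specific topology), and the function names `direct_message_counts` and `hypercube_store_and_forward` are hypothetical. In each stage a process exchanges at most one aggregated message with one neighbor, so the number of messages a process sends is bounded by log2(P), at the price of forwarding some payloads more than once.

```python
def direct_message_counts(messages, num_procs):
    """Messages sent per process when every (src, dst) pair communicates directly."""
    sent = [0] * num_procs
    for src, dst, _payload in messages:
        if src != dst:
            sent[src] += 1
    return sent


def hypercube_store_and_forward(messages, num_procs):
    """Route (src, dst, payload) items over an assumed hypercube pattern.

    One stage per hypercube dimension. In stage d, process p forwards every
    stored item whose destination differs from p in bit d to neighbor p ^ 2**d,
    packed into a single aggregated message.
    Returns (delivered, sent): payloads received per process, and the number
    of aggregated messages each process sent.
    """
    if num_procs <= 0 or num_procs & (num_procs - 1):
        raise ValueError("num_procs must be a power of two for a hypercube")
    dims = num_procs.bit_length() - 1

    # buffers[p] holds the (dst, payload) items currently stored at process p.
    buffers = [[] for _ in range(num_procs)]
    for src, dst, payload in messages:
        buffers[src].append((dst, payload))

    sent = [0] * num_procs
    for d in range(dims):
        bit = 1 << d
        next_buffers = [[] for _ in range(num_procs)]
        for p in range(num_procs):
            outgoing = []
            for dst, payload in buffers[p]:
                if (p ^ dst) & bit:
                    outgoing.append((dst, payload))      # forward this stage
                else:
                    next_buffers[p].append((dst, payload))  # keep stored
            if outgoing:
                sent[p] += 1  # one aggregated message to the stage-d neighbor
                next_buffers[p ^ bit].extend(outgoing)
        buffers = next_buffers

    # After all stages, every item sits at its destination process.
    delivered = {p: sorted(payload for _dst, payload in buffers[p])
                 for p in range(num_procs)}
    return delivered, sent


if __name__ == "__main__":
    # Hypothetical workload: process 0 sends a distinct item to each of 7 others.
    msgs = [(0, d, f"m{d}") for d in range(1, 8)]
    print("direct max messages:", max(direct_message_counts(msgs, 8)))
    delivered, sent = hypercube_store_and_forward(msgs, 8)
    print("store-and-forward max messages:", max(sent))
```

In this toy workload the busiest process sends 7 messages with direct communication and 3 with the staged scheme, which is the kind of upper bound on message count the abstract refers to. The extra forwarding hops increase the total volume moved, so the trade only pays off when latency, not bandwidth, dominates.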

Similar resources

Reducing latency cost in 2D sparse matrix partitioning models

Sparse matrix partitioning is a common technique for improving the performance of parallel linear iterative solvers. Compared to solvers used for symmetric linear systems, solvers for nonsymmetric systems offer more potential for addressing multiple communication metrics due to the flexibility of adopting different partitions on the input and output vectors of sparse matrix-vector mu...

Reducing Synchronization Overheads in CG-type Parallel Iterative Solvers by Embedding Point-to-point Communications into Reduction Operations

Parallel iterative solvers are widely used in solving large sparse linear systems of equations on large-scale parallel architectures. These solvers generally contain two different types of communication operations: point-to-point (P2P) and global collective communications. In this work, we present a computational reorganization method to exploit a property that is commonly found in Krylov subspa...

Reducing Communication Costs Associated with Parallel Algebraic Multigrid

Algebraic multigrid (AMG) is an iterative method for solving sparse linear systems of equations (Ax̂ = b), such as discretized partial differential equations arising in various fields of science and engineering. AMG is considered an optimal solver, requiring only O(n) operations to solve a system of n unknowns. Standard computers contain neither the memory nor computing power to solve increasing...

Parallel Solution of Sparse Linear Systems

Many simulations in science and engineering give rise to sparse linear systems of equations. It is a well-known fact that the cost of the simulation process is almost always governed by the solution of the linear systems, especially for large-scale problems. The emergence of extreme-scale parallel platforms, along with the increasing number of processing cores available on a single chip pose sign...

A priori power estimation of linear solvers on multi-core processors

High-performance computing (HPC) centres simulate complex scientific models which provide vital understanding of our world. In recent years, power efficiency has become a critical aspect of new HPC facilities because of high energy consumption costs. In this work, we present our study on the power consumption of linear solvers on modern multi-core CPUs which are widely used in many scientif...

Publication date: 2017